EDF STATISTICS FOR GOODNESS-OF-FIT: PART I

by 

M.  A.  Stephens 

1. Introduction.

The goodness-of-fit problem is as follows: given a random sample
$x_1, x_2, \ldots, x_n$, test $H_0$: the sample comes from a population
with distribution function $F(x)$. The classical test for this problem
is the $\chi^2$-test, which has several advantages: (a) it is well
adapted to the case when $F(x)$ is discontinuous, i.e., represents a
discrete distribution, and (b) it is known (at least to a good
approximation) how to adapt the statistic for the case when parameters
of $F(x)$ must themselves be estimated from the sample.

This paper deals with another class of goodness-of-fit statistics:
EDF statistics, so called because they are based on a comparison of
$F(x)$ with the empirical distribution function $F_n(x)$. For the case
when $F(x)$ is continuous and completely specified (Case 0 below) it
has long been known that, in general, EDF statistics give more
powerful tests of $H_0$ than $\chi^2$; the disadvantage is that they
are not well adapted to discrete distributions, nor to the case when
parameters must be estimated from the sample. This last drawback has
undoubtedly prevented their wider application in practice, together
with the fact that they are relatively difficult to compute. Recent
work has now made it possible to use these statistics very easily in
Case 0, and also in two very important practical situations: when the
distribution tested is normal, or exponential, with parameters to be
estimated. Power studies suggest they should be brought into wider
use.
In this report we concentrate on a practical guide to the use of
EDF statistics, specifically those usually called $D^+$, $D^-$, $D$,
$W^2$, $V$, $U^2$, $A$. A suffix $n$ is often added to represent sample
size, but this will be omitted.

Once a test statistic has been calculated, a table is entered to
make the test. The choice of table depends on what is known of $F(x)$,
so this is classified first, in section 2. The formulas and procedures
are in sections 3 and 4. Comments on the tables and computational
details are in sections 5 and 6, and Part 1 ends with some general
observations on power and choice of statistic.

2.  Knowledge  of  F(x). 

The  tables  to  be  used  with  the  statistics  depend  on  knowledge  of 
F(x),  classified  as  follows. 

(a) Case 0: $F(x)$ continuous, completely specified. This is the
classical case, and tables of significance points for all the
statistics exist in the literature. For references see Stephens
(1970b). The use of Table 0 as described below permits us to dispense
with these tables.

(b) Case 1: $F(x)$ is the normal distribution, $\sigma^2$ known, $\mu$
estimated by $\bar{x}$.

(c) Case 2: $F(x)$ is the normal distribution, $\mu$ known, $\sigma^2$
estimated by $s^2 = \sum_i (x_i - \bar{x})^2/(n-1)$.

(d) Case 3: $F(x)$ is the normal distribution, both $\mu$ and
$\sigma^2$ unknown, estimated as above.

(e) Case 4: $F(x)$ is the exponential distribution, i.e.,
$F(x) = 1 - \exp(-\theta x)$, $\theta$ estimated by $1/\bar{x}$.

In the case of normality, Case 3 is the important practical situation,
though Case 2 sometimes arises, e.g. in regression analysis, when $\mu$
is known to be zero.

3. Test procedures.

The goodness-of-fit test takes the following steps:

(a) When necessary, parameters are estimated from the sample, as
described above.

(b) The values $x_1, x_2, \ldots, x_n$ are assumed to be in ascending
order; then calculate $z_i = F(x_i)$, for $i = 1, 2, \ldots, n$, where
$F(x)$ may contain estimated parameters for Cases 1 to 4. Then
$z_1 \le z_2 \le \cdots \le z_n$.

(c) The desired statistic is calculated as described below; suppose
we call it $T$. The appropriate Table $i$ is entered (corresponding to
Case $i$) and $T^*$, the modified $T$, is found from the expression
given; then $T^*$ is referred to the adjoining set of significance
points to make the test.

The test given is the usual upper tail test; on occasion the
lower tail may have to be used (see section 8.3, and Seshadri, Csorgo
and Stephens (1969)).
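
Steps (a) and (b) can be sketched in a few lines of Python. This is only an illustrative sketch, not the author's FORTRAN routine; the function names are hypothetical, and the normal distribution function is evaluated through the standard-library error function.

```python
import math

def normal_cdf(w):
    """Standard normal distribution function, via the error function."""
    return 0.5 * (1.0 + math.erf(w / math.sqrt(2.0)))

def z_values_case0(xs, cdf):
    """Case 0: F(x) completely specified; cdf is the hypothesised F."""
    return [cdf(x) for x in sorted(xs)]

def z_values_case3(xs):
    """Case 3: normal F(x), mean and variance both estimated from the sample."""
    n = len(xs)
    xbar = sum(xs) / n
    s = math.sqrt(sum((x - xbar) ** 2 for x in xs) / (n - 1))
    return [normal_cdf((x - xbar) / s) for x in sorted(xs)]

def z_values_case4(xs):
    """Case 4: F(x) = 1 - exp(-theta x), theta estimated by 1/xbar."""
    theta = len(xs) / sum(xs)
    return [1.0 - math.exp(-theta * x) for x in sorted(xs)]
```

Each function sorts the sample first, so the returned list is already in ascending order, as step (b) requires.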
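4. Calculation of statistics.

(a) The Kolmogorov statistics $D^+$, $D^-$, $D$:

$$D^+ = \max_{1 \le i \le n} \left( \frac{i}{n} - z_i \right); \qquad
D^- = \max_{1 \le i \le n} \left( z_i - \frac{i-1}{n} \right); \qquad
D = \max(D^+, D^-).$$

(b) The Cramér-von Mises statistic $W^2$:

$$W^2 = \sum_{i=1}^{n} \left( z_i - \frac{2i-1}{2n} \right)^2 + \frac{1}{12n}.$$

(c) The Kuiper statistic $V$:

$$V = D^+ + D^-.$$

(d) The Watson statistic $U^2$:

$$U^2 = W^2 - n\left( \bar{z} - \tfrac{1}{2} \right)^2, \qquad
\text{where } \bar{z} = \sum_{i=1}^{n} z_i / n.$$

(e) The Anderson-Darling statistic $A$:

$$A = -\left\{ \sum_{i=1}^{n} (2i-1) \left[ \ln z_i + \ln(1 - z_{n+1-i}) \right] \right\} / n - n.$$

When the statistic has been calculated, use Table $i$ for Case $i$:
$H_0$ is rejected if the modified statistic exceeds the point given at
the chosen level of significance.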
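
The formulas of this section translate directly into code. The following Python sketch is illustrative only (the function name and the returned dictionary keys are assumptions); it takes the ordered values $z_1, \ldots, z_n$, assumed to lie strictly between 0 and 1, and returns the unmodified statistics, using the standard computing formula for $W^2$.

```python
import math

def edf_statistics(z):
    """Unmodified EDF statistics from sorted z-values in (0, 1).

    Python indices run from 0, so the paper's i corresponds to i + 1 here.
    """
    n = len(z)
    d_plus = max((i + 1) / n - zi for i, zi in enumerate(z))
    d_minus = max(zi - i / n for i, zi in enumerate(z))
    w2 = sum((zi - (2 * i + 1) / (2 * n)) ** 2
             for i, zi in enumerate(z)) + 1.0 / (12 * n)
    zbar = sum(z) / n
    u2 = w2 - n * (zbar - 0.5) ** 2
    # (2i - 1)[ln z_i + ln(1 - z_{n+1-i})] in the paper's 1-based notation
    a = -sum((2 * i + 1) * (math.log(z[i]) + math.log(1.0 - z[n - 1 - i]))
             for i in range(n)) / n - n
    return {"D+": d_plus, "D-": d_minus, "D": max(d_plus, d_minus),
            "V": d_plus + d_minus, "W2": w2, "U2": u2, "A": a}

stats = edf_statistics([0.25, 0.75])
```

The values returned are the raw statistics $T$; the modification to $T^*$ from the appropriate table must still be applied before the significance points are consulted.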


Points on a circle. Although they can also be used, like the other
statistics, for points on a line, the statistics $U^2$ and $V$ were
introduced for points on a circle. Only these two statistics should
be calculated for such points, and any suitable origin may be used;
the other statistics will take different values according to choice of
origin.
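
The dependence on origin can be seen in a small numerical sketch. The data below are invented for illustration, the points are taken to be already expressed as fractions of the circumference in [0, 1), and the helper names are hypothetical. Moving the origin halfway round the circle changes $D$ but leaves $V$ unchanged.

```python
def d_and_v(points):
    """Return (D, V) for points given as fractions of the circumference."""
    z = sorted(points)
    n = len(z)
    d_plus = max((i + 1) / n - zi for i, zi in enumerate(z))
    d_minus = max(zi - i / n for i, zi in enumerate(z))
    return max(d_plus, d_minus), d_plus + d_minus

def shift_origin(points, c):
    """Measure the same points from a new origin located at c."""
    return [(p - c) % 1.0 for p in points]

pts = [0.05, 0.10, 0.20, 0.30, 0.90]          # illustrative data only
d0, v0 = d_and_v(pts)                          # D = 0.5, V = 0.6
d1, v1 = d_and_v(shift_origin(pts, 0.5))       # D = 0.4, V = 0.6
```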

Illustration 1. Suppose $F(x)$ is completely specified, and $D$ is
0.27 for 25 observations. Then, in Table 0, the modified $D$ is

$$D^* = 0.27\,(5 + 0.12 + 0.11/5) = 1.388.$$

Reference to the table of significance points for $D^*$ in Table 0
shows $D^*$ to be significant at the 5% level.
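
The arithmetic of Illustration 1 can be checked with a short Python sketch. The function and variable names are hypothetical; the significance points are those listed for $D$ in Table 0.

```python
import math

# Upper-tail percentage points for the modified D in Case 0 (Table 0).
TABLE0_D_POINTS = {15.0: 1.138, 10.0: 1.224, 5.0: 1.358, 2.5: 1.480, 1.0: 1.628}

def modified_d_case0(d, n):
    """D* = D (sqrt(n) + 0.12 + 0.11 / sqrt(n))."""
    root_n = math.sqrt(n)
    return d * (root_n + 0.12 + 0.11 / root_n)

def smallest_significant_level(t_star, points):
    """Smallest tabulated level (%) whose point t_star exceeds, else None."""
    levels = [level for level, point in points.items() if t_star > point]
    return min(levels) if levels else None

d_star = modified_d_case0(0.27, 25)                          # about 1.388
level = smallest_significant_level(d_star, TABLE0_D_POINTS)  # 5.0
```

Here $D^*$ exceeds the 5% point 1.358 but not the 2.5% point 1.480, in agreement with the conclusion of the illustration.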


Illustration 2. A test is made that 20 observations are from a normal
population with mean and variance unknown. The sample gives $\bar{x}$
and $s^2 = \sum_i (x_i - \bar{x})^2/(n-1)$. For each $x_i$, it is
convenient first to find $v_i = (x_i - \bar{x})/s$ and then

$$z_i = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{v_i} \exp(-t^2/2)\, dt.$$

Using the $z_i$ as above, suppose $W^2$ is found to be 0.054. In
Table 3, $W^*$ is $W^2(1 + 0.5/n) = 0.054\,(41/40) = 0.055$. This is
not greater than 0.091, i.e. not significant at the 15% level.
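
The last step of Illustration 2 can be sketched as follows. The names are hypothetical, and the significance points are those listed for $W^2$ in Table 3.

```python
# Upper-tail percentage points for the modified W^2 in Case 3 (Table 3).
TABLE3_W2_POINTS = {15.0: 0.091, 10.0: 0.104, 5.0: 0.126, 2.5: 0.148, 1.0: 0.178}

def modified_w2_case3(w2, n):
    """W* = W^2 (1 + 0.5 / n), the Case 3 modification."""
    return w2 * (1.0 + 0.5 / n)

w_star = modified_w2_case3(0.054, 20)                   # 0.054 * 41/40
significant_at_15 = w_star > TABLE3_W2_POINTS[15.0]     # False
```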


5. Tables.

Table A contains Tables 0, 3, 4 for the three most practical
cases. Table B contains tables for Cases 1 and 2.
Table 0, with $A$ added, comes from Stephens (1970b); note the
different meaning for $A$ in that paper. The Anderson-Darling statistic,
in Case 0, converges so rapidly that no modification is needed in any
realistic situation ($n > 5$): see Marshall (1958) and Table B (Table 6)
for Monte Carlo studies by the author. In Tables 1-4, the asymptotic
points for $W^2$, $U^2$ and $A$ have been calculated theoretically
(Stephens, 1971). For finite $n$, significance points from Monte Carlo
studies, mostly based on 10,000 samples for each of many values of $n$,
then smoothed, have been used to calculate the modifications. (Stephens
1969, 1970a contain original 5% and 1% points for all except $A$; for
completeness these points, added later, are now given in Table B.)

Other workers have found points for some statistics as indicated:
Lilliefors (1967, 1969), $D$; van Soest (1967), $D$, $W^2$; Koerts
and Abrahamse (1969), $V$. The points agree well with those given by
use of Table A, except for some differences in estimates of asymptotic
points for $D$ and $V$. Those given here are based on larger samples
($n$ up to 100; other authors have $n < 40$) but in any event the
practical difference is negligible.

For Case 2, some Monte Carlo points and asymptotic points are
given in Table B, Table 2, but no modifications have been calculated.
For Case 1, the most unlikely situation, only theoretically calculated
asymptotic points are known (Table B, Table 1).

6.  Computing  details. 

(a) The modifications to the well-known statistics were made in order
to dispense with extensive tables; computer subroutines can then easily
be written to calculate the modified statistics in a given Case and,
for that Case, to print out the appropriate set of significance points
so that the user can make his test. Such a routine is available
(FORTRAN) from the author.

(b) At one time it seemed desirable to approximate the set of
significance points by distributions of the form $a + b\chi^2_p$, so
that a modified statistic $T^*$ could be used to calculate a further
modification $T^{**} = (T^* - a)/b$, and the program would print out
$T^{**}$ and $p$ with an instruction to compare with the $\chi^2_p$
distribution. For practical use, $p$ would need to be an integer. Even
with this limitation, excellent approximations were found (Stephens
1969, 1970a), and the values of $a$, $b$ and $p$, for Cases 0, 3 and 4,
are in Table C.
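
As a sketch of how the approximation in (b) might be used, the Python fragment below computes $T^{**}$ and the corresponding upper-tail probability. The function names are hypothetical, and the closed-form tail is valid only for even $p$; other values of $p$ would need a general chi-square routine. The constants used are the Table C entries for $A$ in Case 3 ($a = 0.212$, $b = 0.095$, $p = 2$).

```python
import math

def t_double_star(t_star, a, b):
    """Further modification T** = (T* - a) / b."""
    return (t_star - a) / b

def chi2_upper_tail_even(x, p):
    """P(chi-square with p d.f. > x), closed form for even p only."""
    if p <= 0 or p % 2 != 0:
        raise ValueError("closed form requires a positive even p")
    half = x / 2.0
    term, total = 1.0, 1.0
    for k in range(1, p // 2):
        term *= half / k          # (x/2)^k / k!
        total += term
    return math.exp(-half) * total

# The 5% point of the modified A in Case 3 (Table 3) is 0.787.
t2 = t_double_star(0.787, 0.212, 0.095)
tail = chi2_upper_tail_even(t2, 2)     # close to 0.05
```

The tail probability recovered from the 5% point is close to 0.05, which gives a rough check on the quality of the fit for this statistic.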

7. Power comparisons: general conclusions.

We end Part 1 with a résumé of the power situation, based on the
comparisons given in detail in Part 2. For all three practical cases,
EDF statistics compare excellently with other goodness-of-fit
statistics, the only serious rival being $W$ (see below) for Case 3.
On the whole, $A$, $W^2$ and $U^2$ are recommended. Each case is now
considered in turn.

Case 0. If $F(x)$ is completely specified, the $z_i$ should be uniformly
distributed between 0 and 1, written $U(0,1)$. Power studies have
therefore been confined to a test of this hypothesis concerning the
$z_i$, when the $z_i$ are in fact drawn from alternative distributions.
If the variance of the hypothesised $F(x)$ is correct, but the mean is
wrong, the points $z_i$ will tend to move toward 0 or 1; if the mean is
correct, but the variance wrong, the points will move to each end, or
will move towards 0.5. $D$ and $W^2$ tend to judge the same samples
significant, as do $V$ and $U^2$; the first pair will detect the change
in mean better, and the second pair will detect the change in variance.
$W^2$ tends to be better than $D$, and $U^2$ slightly better than $V$;
the best pair is always better than the $\chi^2$ test. Thus in practice
it would seem always worth while looking at $W^2$ and $U^2$.
Historically, $D$ has been the most used EDF statistic, but it tends to
be the least powerful, overall, of the four. Unfortunately, for this
Case, few results exist for $A$. For references to earlier work on
Case 0, see Kendall and Stuart, Vol. 2 (1961).

Case 3. For this case, many test statistics have been proposed in the
past. The EDF statistics, with $A$ in the lead, generally behave much
better than all of them, including $\chi^2$. Another recently introduced
statistic, $W$ (Shapiro and Wilk, 1965), has power comparable to that of
$A$, possibly slightly greater, but not overwhelmingly so, as earlier
reported. It has some disadvantages in the ease with which the test
can be made (see section 8.2). The results for EDF statistics,
particularly $A$, and $W$ are very highly correlated, and it would be
interesting to see this connection explored further.

Case 4. For this case also many tests have been proposed. We have
investigated the Case 4 procedure and three other transformations,
each of which produces values $z_i$ which must then be tested for
uniformity. On the whole, $W^2$ or $A$, with Case 4, seem to be the best
omnibus statistics, though further work needs to be done.
Other considerations. With the existence of modern computers, there
is a temptation to investigate existing statistics, or invent new
ones and investigate them, by Monte Carlo methods. Part 2 is full
of such studies. Nevertheless, this can be a risky procedure, since
it is easy to make mistakes, and yet not know it. Most checks can only
be made by someone else repeating the experiment; most of the results
in this paper have been so checked, except for some of the power
studies. Apart from aesthetic reasons, the more mathematical results
that can be produced to support Monte Carlo work the better. In
connection with producing significance points, mathematical work can be
and has been done on $W^2$, $U^2$ and $A$ in Cases 1 to 4, to get
reliable asymptotic percentage points, and the statistics all converge
rapidly to their asymptotic distributions; similar work has not yet
been done for $D$, $V$. If we add the good overall power properties of
$W^2$, $A$, and $U^2$, and their ease of computation, it would seem that
they should be brought into greater use.


This work was supported by the National Research Council of Canada, and
also by the U.S. Office of Naval Research, Contract No.
N00014-67-A-0112-0053. The author expresses thanks to both these agencies.
TABLE A

Modifications to D, V, W^2, U^2, A


Table 0.  Modifications for the test when F(x) is completely known

                                                      Percentage points for T*
Statistic T   Modified form T*                        15.0    10.0    5.0     2.5     1.0

D+ (D-)       D+(sqrt(n) + 0.12 + 0.11/sqrt(n))       0.973   1.073   1.224   1.358   1.518
D             D(sqrt(n) + 0.12 + 0.11/sqrt(n))        1.138   1.224   1.358   1.480   1.628
V             V(sqrt(n) + 0.155 + 0.24/sqrt(n))       1.537   1.620   1.747   1.862   2.001
W^2           (W^2 - 0.4/n + 0.6/n^2)(1.0 + 1.0/n)    0.284   0.347   0.461   0.581   0.743
U^2           (U^2 - 0.1/n + 0.1/n^2)(1.0 + 0.8/n)    0.131   0.152   0.187   0.221   0.267
A             For all n > 5: A (no modification)      1.610   1.933   2.492   3.070   3.857

Table 3.  Modifications for a test for normality, mu and sigma^2 unknown

                                                      Percentage points for T*
Statistic T   Modified form T*                        15.0    10.0    5.0     2.5     1.0

D             D(sqrt(n) - 0.01 + 0.85/sqrt(n))        0.775   0.819   0.895   0.955   1.035
V             V(sqrt(n) + 0.05 + 0.82/sqrt(n))        1.320   1.386   1.489   1.585   1.693
W^2           W^2(1 + 0.5/n)                          0.091   0.104   0.126   0.148   0.178
U^2           U^2(1 + 0.5/n)                          0.085   0.096   0.116   0.136   0.163
A             A(1 + 4/n - 25/n^2)                     0.576   0.656   0.787   0.918   1.092

Table 4.  Modifications for a test for exponentiality, theta unknown

                                                      Percentage points for T*
Statistic T   Modified form T*                        15.0    10.0    5.0     2.5     1.0

D             (D - 0.2/n)(sqrt(n) + 0.26 + 0.5/sqrt(n))    0.926   0.990   1.094   1.190   1.308
V             (V - 0.2/n)(sqrt(n) + 0.24 + 0.35/sqrt(n))   1.445   1.527   1.655   1.774   1.910
W^2           W^2(1 + 0.16/n)                         0.149   0.177   0.224   0.275   0.337
U^2           U^2(1 + 0.16/n)                         0.112   0.130   0.161   0.191   0.230
A             A(1 + 0.6/n)                            0.922   1.078   1.341   1.606   1.957

TABLE B

All asymptotic points in Tables 1, 2, 6 for W^2, U^2, A are theoretically
derived (Stephens, 1971).


TABLE 1

Asymptotic points for W^2, U^2, A, Case 1

Significance level (%):   15      10      5       2.5     1

W^2                       -       0.135   0.165   0.196   0.237
U^2                       -       0.128   0.157   0.187   0.227
A                         -       0.908   1.105   1.304   1.575

TABLE 2

Significance points for Case 2.  (Monte Carlo results for D, V.)

                         Percentage level (%):
Statistic     n          15      10      5       2.5     1

sqrt(n) D     10         1.050   1.138   1.270   1.380   1.530
              20         1.070   1.160   1.290   1.415   1.570
              50         1.080   1.170   1.310   1.432   1.595
              100        1.100   1.180   1.320   1.440   1.610
              ∞          1.120   1.190   1.335   1.455   1.625

sqrt(n) V     10         1.305   1.385   1.500   1.595   1.710
              20         1.345   1.410   1.535   1.642   1.770
              50         1.380   1.450   1.570   1.680   1.810
              100        1.390   1.470   1.590   1.697   1.825
              ∞          1.410   1.490   1.612   1.720   1.845

W^2           all n      -       0.329   0.443   0.562   0.723
U^2           all n      -       0.123   0.153   0.182   0.221
A             all n      -       1.760   2.323   2.904   3.690



TABLE 6

Monte Carlo points for A: Case 0, Case 3, Case 4

                         Percentage level (%):
              n          15      10      5       2.5     1

Case 0        5          1.63    1.94    2.54    3.09    3.97
              ∞          1.610   1.933   2.492   3.070   3.857

Case 3        10         0.514   0.578   0.683   0.779   0.926
              20         0.528   0.591   0.704   0.815   0.969
              50         0.546   0.616   0.735   0.861   1.021
              100        0.559   0.631   0.754   0.884   1.047
              ∞          0.576   0.656   0.787   0.918   1.092

Case 4        10         0.887   1.022   1.265   1.515   1.888
              20         0.898   1.045   1.300   1.556   1.927
              50         0.911   1.062   1.323   1.582   1.945
              100        0.916   1.070   1.330   1.595   1.951
              ∞          0.922   1.078   1.341   1.606   1.957

TABLE C

Values of a, b, p for an approximation of type a + b chi^2_p
to significance points in Table A

Statistic     Case     a          b          p

D+, D-        0        (2D+*)^2 is chi^2_2 distributed
D             0        0.1343     0.049      15
V             0        0.178      0.0358     30
W^2           0        0.061      0.105      1
U^2           0        0.031      0.026      2

D             3        0.115      0.022      23
V             3        -0.251     0.022      60
W^2           3        0.0187     0.0136     3
U^2           3        0.0114     0.0111     4
A             3        0.212      0.095      2

D             4        0.017      0.0343     20
V             4        -0.336     0.0295     50
W^2           4        0.046      0.0466     1
U^2           4        0.0265     0.0226     2
A             4        0.454      0.231      1